Two-Way Neural Machine Translation: A Proof of Concept for Bidirectional Translation Modeling using a Two-Dimensional Grid
Neural translation models have proven to be effective in capturing sufficient
information from a source sentence and generating a high-quality target
sentence. However, achieving good quality in both directions, i.e., both
source-to-target and target-to-source translation with a single model, is not
straightforward. Apart from a few pioneering attempts, such as multilingual
systems, existing bidirectional translation approaches require training two
separate models. This paper proposes to build a single end-to-end
bidirectional translation model using a two-dimensional grid, where the
left-to-right decoding generates the source-to-target output and the
bottom-to-top decoding generates the target-to-source output. Instead of
training two models
independently, our approach encourages a single network to jointly learn to
translate in both directions. Experiments on the WMT 2018
German↔English and Turkish↔English translation tasks show that the proposed
model achieves good translation quality and has sufficient potential to guide
further research.
Comment: 6 pages, accepted at SLT2021
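
As a rough illustration of the grid idea, the following minimal NumPy sketch
(toy sizes and a single shared projection are assumptions for illustration,
not the paper's exact architecture) builds one state per (source position,
target position) pair; the last row then conditions source-to-target decoding
and the last column conditions target-to-source decoding.

    # Minimal 2D-grid sketch: cell (i, j) summarizes the source prefix up to
    # position i and the target prefix up to position j.
    import numpy as np

    rng = np.random.default_rng(0)
    d = 8                                  # toy hidden size
    src = rng.normal(size=(5, d))          # embedded source tokens (length 5)
    tgt = rng.normal(size=(4, d))          # embedded target tokens (length 4)
    W = rng.normal(size=(d, 4 * d)) * 0.1  # single shared projection (toy)

    I, J = len(src), len(tgt)
    grid = np.zeros((I + 1, J + 1, d))     # row/column 0 act as start states
    for i in range(1, I + 1):
        for j in range(1, J + 1):
            x = np.concatenate([src[i - 1], tgt[j - 1],
                                grid[i - 1, j], grid[i, j - 1]])
            grid[i, j] = np.tanh(W @ x)    # one grid cell

    # Source-to-target decoding conditions on the last row (all source seen);
    # target-to-source decoding conditions on the last column.
    s2t_states = grid[I, 1:]               # shape (4, 8)
    t2s_states = grid[1:, J]               # shape (5, 8)
    print(s2t_states.shape, t2s_states.shape)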
Tight Integrated End-to-End Training for Cascaded Speech Translation
A cascaded speech translation model relies on discrete and non-differentiable
transcription, which provides a supervision signal from the source side and
helps the transformation between source speech and target text. Such modeling
suffers from error propagation between ASR and MT models. Direct speech
translation is an alternative approach that avoids error propagation; however,
its performance often lags behind that of cascade systems. To use an intermediate
representation and preserve the end-to-end trainability, previous studies have
proposed using two-stage models by passing the hidden vectors of the recognizer
into the decoder of the MT model and ignoring the MT encoder. This work
explores the feasibility of collapsing the entire cascade into a
single end-to-end trainable model by optimizing all parameters of ASR and MT
models jointly without ignoring any learned parameters. This tightly
integrated method passes renormalized source word posterior distributions as
soft decisions instead of one-hot vectors, enabling end-to-end backpropagation.
Therefore, it provides both transcriptions and translations and achieves strong
consistency between them. Our experiments on four tasks with different data
scenarios show that the model outperforms cascade models by up to 1.8% in BLEU
and 2.0% in TER, and is also superior to direct models.
Comment: 8 pages, accepted at SLT2021
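
The soft-decision idea can be sketched compactly. The NumPy snippet below is
a minimal illustration under assumed toy dimensions (the gamma exponent
stands in for the model's renormalization and is not the paper's exact
formulation): instead of a one-hot, non-differentiable transcript, the MT
encoder receives expected embeddings weighted by renormalized ASR posteriors.

    import numpy as np

    rng = np.random.default_rng(0)
    V, d, T = 10, 8, 3                    # toy vocab size, embed dim, steps
    asr_logits = rng.normal(size=(T, V))  # would come from the ASR decoder
    E = rng.normal(size=(V, d))           # MT source embedding matrix

    def renormalize(logits, gamma=2.0):
        # Softmax, sharpen with an exponent, then renormalize so every step
        # remains a proper distribution (large gamma approaches one-hot).
        p = np.exp(logits - logits.max(axis=-1, keepdims=True))
        p /= p.sum(axis=-1, keepdims=True)
        p = p ** gamma
        return p / p.sum(axis=-1, keepdims=True)

    posteriors = renormalize(asr_logits)  # (T, V), differentiable in logits
    soft_embeddings = posteriors @ E      # expected embedding per step, (T, d)
    # A hard argmax lookup would block gradients; this expectation lets the
    # MT loss backpropagate through the posteriors into the ASR parameters.
    print(soft_embeddings.shape)          # (3, 8)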
Take the Hint: Improving Arabic Diacritization with Partially-Diacritized Text
Automatic Arabic diacritization is useful in many applications, ranging from
reading support for language learners to accurate pronunciation prediction for
downstream tasks like speech synthesis. While most previous work has focused
on models that operate on raw non-diacritized text, production systems can
gain accuracy by first letting humans partially annotate ambiguous words. In
this paper, we propose 2SDiac, a multi-source model that can effectively
support optional diacritics in the input to inform all predictions. We also
introduce Guided Learning, a training scheme to leverage given diacritics in
input with different levels of random masking. We show that hints provided at
test time affect more output positions than just the annotated ones. Moreover,
experiments on two common benchmarks show that our approach i) greatly
outperforms the baseline even when evaluated on non-diacritized text; and ii)
achieves state-of-the-art results while reducing the parameter count by over
60%.
Comment: Arabic text diacritization, partially-diacritized text, Arabic
natural language processing
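
To make the Guided Learning scheme concrete, the short Python sketch below
(a hypothetical helper for illustration, not the authors' released code)
randomly drops each diacritic with some probability, simulating the different
masking levels applied to the input during training.

    import random
    import unicodedata

    def mask_diacritics(diacritized, keep_prob):
        # Keep each diacritic independently with probability keep_prob;
        # Arabic diacritics are nonspacing combining marks (category 'Mn').
        out = []
        for ch in diacritized:
            if unicodedata.category(ch) == 'Mn' and random.random() > keep_prob:
                continue                  # drop this diacritic (mask it)
            out.append(ch)
        return ''.join(out)

    word = 'كَتَبَ'                # fully diacritized toy example ("kataba")
    random.seed(0)
    for keep in (0.0, 0.5, 1.0):  # masking levels: no, half, all hints kept
        print(keep, mask_diacritics(word, keep))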